AITopics | state action value

Log-normality and Skewness of Estimated State/Action Values in Reinforcement Learning

Neural Information Processing Systems | Mar-17-2026, 14:38:16 GMT

Under- and overestimation of state/action values are harmful to reinforcement learning agents. In this paper, we show that a state/action value estimated using the Bellman equation can be decomposed into a weighted sum of path-wise values that follow log-normal distributions. Since log-normal distributions are skewed, the distribution of estimated state/action values can also be skewed, leading to an imbalanced likelihood of under- and overestimation. The degree of this imbalance can vary greatly among actions and policies within a single problem instance, making the agent prone to selecting actions/policies that have inferior expected return and a higher likelihood of overestimation. We present a comprehensive analysis of such skewness, examine its factors and impacts through both theoretical and empirical results, and discuss possible ways to reduce its undesirable effects.
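The skewness argument in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's model: the weights, `mu`, `sigma`, and both function names are hypothetical choices made only for illustration. It draws a weighted sum of independent log-normal terms many times and measures how often the sum falls below its exact expectation.

```python
import math
import random
import statistics


def weighted_lognormal_sum(rng, weights, mu, sigma):
    """One toy 'estimated value': a weighted sum of independent log-normal terms."""
    return sum(w * rng.lognormvariate(mu, sigma) for w in weights)


def underestimation_rate(n_trials=20000, weights=(0.5, 0.3, 0.2),
                         mu=0.0, sigma=1.0, seed=0):
    """Return (expected value, sample median, fraction of samples below expectation).

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    # Exact mean of a log-normal term is exp(mu + sigma^2 / 2).
    expected = sum(weights) * math.exp(mu + sigma ** 2 / 2)
    samples = [weighted_lognormal_sum(rng, weights, mu, sigma)
               for _ in range(n_trials)]
    below = sum(s < expected for s in samples) / n_trials
    return expected, statistics.median(samples), below


if __name__ == "__main__":
    expected, median, below = underestimation_rate()
    print(f"expected value:      {expected:.3f}")
    print(f"sample median:       {median:.3f}")
    print(f"P(sample < expected): {below:.3f}")
```

Under these assumptions the sample median sits below the expectation and noticeably more than half of the draws fall below it, which is the kind of imbalance between under- and overestimation that the abstract describes.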

artificial intelligence, machine learning, reinforcement learning, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)

Log-normality and Skewness of Estimated State/Action Values in Reinforcement Learning

Neural Information Processing Systems | Nov-21-2025, 15:28:51 GMT

log-normality and skewness, name change, state action value, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)

Log-normality and Skewness of Estimated State/Action Values in Reinforcement Learning

Liangpeng Zhang, Ke Tang, Xin Yao

Neural Information Processing Systems | Nov-21-2025, 10:21:57 GMT

Under- and overestimation of state/action values are harmful to reinforcement learning agents.

machine learning, reinforcement learning, skewness, (16 more...)

Neural Information Processing Systems

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > West Midlands > Birmingham (0.04)

Genre: Research Report (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Reviews: Log-normality and Skewness of Estimated State/Action Values in Reinforcement Learning

Neural Information Processing Systems | Oct-7-2024, 23:22:47 GMT

This paper focuses on the problem arising from skewness in the distribution of value estimates, which may result in over- or underestimation. Through careful analysis, the paper shows that a particular model-based value estimate is approximately log-normally distributed; this distribution is skewed and thus leads to the possibility of over- or underestimation. It is further shown that positive and negative rewards induce opposite kinds of skewness. The problem of over/underestimation is illustrated with simple experiments. This is an interesting paper with some useful insights on over/underestimation of values.
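The remark about positive and negative rewards can be made concrete with a toy calculation. This is a sketch under a simplifying assumption that is not taken from the paper: each value sample is modeled as a fixed reward multiplied by a log-normal factor. The names `sample_skewness` and `toy_value_samples` and all parameter values are hypothetical.

```python
import math
import random


def sample_skewness(xs):
    """Moment-based sample skewness: m3 / m2^(3/2)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def toy_value_samples(reward, n=20000, mu=0.0, sigma=0.5, seed=0):
    """Toy value samples: a fixed reward scaled by a log-normal factor.

    This scaling model is an illustrative assumption, not the paper's estimator.
    """
    rng = random.Random(seed)
    return [reward * rng.lognormvariate(mu, sigma) for _ in range(n)]


if __name__ == "__main__":
    skew_pos = sample_skewness(toy_value_samples(reward=+1.0))
    skew_neg = sample_skewness(toy_value_samples(reward=-1.0))
    print(f"skewness with positive reward: {skew_pos:.3f}")
    print(f"skewness with negative reward: {skew_neg:.3f}")
```

In this toy model the skewness simply changes sign with the reward, so a positive reward gives a long right tail and a negative reward gives a long left tail.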

log-normality and skewness, reinforcement learning, state action value, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Log-normality and Skewness of Estimated State/Action Values in Reinforcement Learning

Liangpeng Zhang, Ke Tang, Xin Yao

Neural Information Processing Systems | Oct-3-2024, 13:31:38 GMT

log-normal distribution, skewness, state value, (14 more...)

Neural Information Processing Systems

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > West Midlands > Birmingham (0.04)

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Log-normality and Skewness of Estimated State/Action Values in Reinforcement Learning

Zhang, Liangpeng, Tang, Ke, Yao, Xin

Neural Information Processing Systems | Feb-14-2020, 08:58:18 GMT

log-normality and skewness, reinforcement learning, state action value, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Log-normality and Skewness of Estimated State/Action Values in Reinforcement Learning

Zhang, Liangpeng, Tang, Ke, Yao, Xin

Neural Information Processing Systems | Dec-31-2017

machine learning, reinforcement learning, skewness, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
